@mihailotim-db mihailotim-db commented Jul 16, 2025

What changes were proposed in this pull request?

This PR adds new functionality to the single-pass analyzer and addresses current issues:

  • Add support for TABLESAMPLE, SemiStructuredExtract, GetJsonObject and JsonTuple
  • Add support for lateral column aliases (LCA) in Aggregate
  • Run PullOutNondeterministic as a post-resolution rule
  • Move ExplicitlyUnsupportedResolverFeature checks to ResolverGuard where possible
  • Fix NameScope name resolution to match the fixed-point analyzer's fallback behavior when resolution fails in one scope
  • Fix type coercion issues caused by missing TypeCoercionRules
  • Fix ExprId assignment
  • Various compatibility and failure fixes
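
For readers unfamiliar with LCA in Aggregate, here is a toy sketch of the semantics only. It is not the resolver code from this PR; `aggregate_with_lca`, the row layout and the column names are all made up for illustration. The idea is that a select-list item in an aggregate may refer to an alias defined by an item to its left, roughly `SELECT dept, sum(salary) AS total, total * 2 AS doubled FROM emp GROUP BY dept`.

```python
from collections import defaultdict


def aggregate_with_lca(rows, group_key, select_list):
    """Toy model of lateral column aliases in an aggregate.

    Each select-list item is an (alias, expr) pair. `expr` receives the rows
    of the current group and a dict of the aliases resolved so far, so an
    item can reuse any alias defined to its left.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    result = []
    for key in sorted(groups):
        out = {group_key: key}
        for alias, expr in select_list:
            # Aliases are added left to right, which is what makes them "lateral".
            out[alias] = expr(groups[key], out)
        result.append(out)
    return result


# Hypothetical input table `emp`.
rows = [
    {"dept": "a", "salary": 100},
    {"dept": "a", "salary": 200},
    {"dept": "b", "salary": 50},
]

select_list = [
    ("total", lambda group, aliases: sum(r["salary"] for r in group)),
    # `doubled` refers to the alias `total` defined just before it.
    ("doubled", lambda group, aliases: aliases["total"] * 2),
]

result = aggregate_with_lca(rows, "dept", select_list)
print(result)
```

A real analyzer resolves these references at plan-analysis time rather than during evaluation; the sketch only shows which references are legal.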

Why are the changes needed?

To replace the existing Spark Analyzer with the single-pass one.

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI runs with ANALYZER_DUAL_RUN_LEGACY_AND_SINGLE_PASS_RESOLVER enabled.
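
As a rough illustration of what a dual run buys in testing, the sketch below runs two analyzers on the same input and fails if they disagree. This is an assumption about the general idea suggested by the flag name, not a description of how Spark implements it; `dual_run`, `AnalyzerMismatch` and both toy analyzers are hypothetical.

```python
class AnalyzerMismatch(Exception):
    """Raised when the two analyzers produce different results (hypothetical)."""


def dual_run(legacy_analyze, single_pass_analyze, plan):
    """Run both analyzers on the same plan and compare their outputs."""
    legacy_result = legacy_analyze(plan)
    single_pass_result = single_pass_analyze(plan)
    if legacy_result != single_pass_result:
        raise AnalyzerMismatch(
            f"legacy={legacy_result!r} single_pass={single_pass_result!r}"
        )
    # Both agree, so either result can be returned.
    return legacy_result


# Toy stand-ins for the two analyzers: they agree on simple input
# but differ in how they treat repeated whitespace.
def legacy(plan):
    return plan.lower()


def single_pass(plan):
    return " ".join(plan.split()).lower()


print(dual_run(legacy, single_pass, "SELECT 1"))
```

Running every CI query through such a comparison surfaces any behavioral difference between the two implementations as a test failure.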

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Jul 16, 2025
@mihailotim-db mihailotim-db force-pushed the mihailotim-db/codesync_single_pass branch 5 times, most recently from 6746598 to 958c2d6 on July 17, 2025 10:28
@mihailotim-db mihailotim-db changed the title from "single pass" to "[SPARK-52842][SQL] New functionality and bugfixes for single-pass analyzer" on Jul 17, 2025
@mihailotim-db mihailotim-db force-pushed the mihailotim-db/codesync_single_pass branch from 958c2d6 to b08cfab on July 17, 2025 12:49
@mihailotim-db mihailotim-db force-pushed the mihailotim-db/codesync_single_pass branch from b08cfab to 5293ea6 on July 17, 2025 17:11
@cloud-fan

the docker test failure is unrelated, thanks, merging to master!

@cloud-fan cloud-fan closed this in 2297cf4 Jul 18, 2025
haoyangeng-db pushed a commit to haoyangeng-db/apache-spark that referenced this pull request Jul 22, 2025
…lyzer

Closes apache#51513 from mihailotim-db/mihailotim-db/codesync_single_pass.

Authored-by: Mihailo Timotic <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>